ROBUST DISTRIBUTED DISCRETE-TIME BLOCK AND SEQUENTIAL DETECTION IN UNCERTAIN ENVIRONMENTS

Evaggelos  Geraniotis 

Department  of  Electrical  Engineering 
and  Systems  Research  Center 
University  of  Maryland 
College  Park,  MD  20742 

ABSTRACT 

Two  detectors  making  independent  observations  must  decide  which  one  of  two 
hypotheses  is  true.  Both  fixed-sample-size  (block)  detection  and  sequential  detection  are 
considered.  The  decisions  are  coupled  through  a  common  cost  function  which  for  tests 
with  fixed  sample  size  consists  of  the  sum  of  the  error  probabilities  while  for  sequential 
tests  it  comprises  the  sum  of  the  error  probabilities  and  the  expected  sample  sizes.  The 
probability  measures  which  govern  the  statistics  of  the  i.i.d.  observations  belong  to 
uncertainty  classes  determined  by  2-alternating  capacities. 

A minimax robust (worst-case) design is pursued according to which the two detectors employ fixed-sample-size tests or sequential probability ratio tests whose likelihood ratios and thresholds depend on the least-favorable probability measures over the uncertainty class. For the aforementioned cost function the optimal thresholds of the two detectors turn out to be coupled. It is shown that, despite the uncertainty, the two detectors are thus guaranteed a minimum level of acceptable performance.

through National Science Foundation CDR-85-00108 and in part by the Office of Naval Research under contract N00014-86-K-0013.

I. INTRODUCTION

In [1] and [2] distributed discrete-time fixed-sample-size (block) detection and sequential detection problems, respectively, were formulated and solved. The two detectors collect independent observations and make decisions which are coupled through a common cost function. As a result, the optimal decisions are characterized by thresholds which are coupled. The hypothesis testing models considered in [1] and [2] assume perfect knowledge of the statistics of the observations.

In this paper we formulate similar problems for the case in which the observations are characterized by statistical uncertainty. Both fixed-sample-size (block) and sequential discrete-time robust detection problems are considered. Continuous-time distributed detection problems with known statistics are considered in [3], while similar problems with statistical uncertainty are treated in [4], the companion to this paper.

In particular, the observations are assumed to have probability distributions (measures) which belong to 2-alternating capacity classes. The classes determined by 2-alternating Choquet capacities include several useful uncertainty models, such as the ε-contaminated class [5], the total variation class [5], the band class [6] and the p-point class [7], which have been popular among statisticians.

The design philosophy that we pursue for the problem above is that of minimax robustness. According to it, a worst-case situation (set of operational conditions), specified in terms of a performance criterion, is identified and the optimal decision design for this situation is derived. Then, this decision design is employed independent of the actual conditions (which are not known, except for the fact that they belong to some structured uncertainty class, e.g., the 2-alternating capacity class), and its performance under any other situation is at least as good as that under the worst-case operational conditions.

Minimax robust signal processing techniques have received considerable attention in the last 15 years (see the tutorial in [8]). The selection of uncertainty classes determined by 2-alternating capacities is motivated by the fact that for the uncertainty models defined in [5]-[7] the least-favorable operational conditions (here probability measures) can be obtained in closed form, as the general results of [9] indicate. In [9] the performance criterion is the Bayes risk or the error probabilities of the Neyman-Pearson formulation of the hypothesis testing problem. The results of [10] complemented those of [9] by considering the Chernoff bounds on the error probabilities and by studying their asymptotic properties in the presence of uncertainty within 2-alternating capacities.

This paper is organized as follows. In Section II we formulate and solve the problem of robust distributed discrete-time detection with fixed sample size. Then in Section III we treat the case of robust distributed discrete-time sequential detection. In each section the distributed system and the uncertainty model are introduced first, then the case of detection under mismatch is considered, then the case of robust detection for finite sample sizes (which are fixed in Section II and random variables in Section III) is treated, and, finally, asymptotic results for large sample sizes are derived.

In all cases the robust tests are based on the likelihood ratios between the least-favorable measures in the uncertainty class, and the optimal decision making of the two detectors is coupled through their thresholds. For both the block and the sequential detection case we show that as the number of observations increases the joint cost function decreases exponentially to zero despite the uncertainty.


II. MINIMAX ROBUST DISTRIBUTED FIXED-SAMPLE-SIZE DETECTION


II.A Problem Formulation and Models of Uncertainty

Consider the following hypothesis testing problem of two simple hypotheses $H_0$ and $H_1$ with two decision-makers. Decision-maker $i$ ($i = 1, 2$) is equipped with a sensor and is faced with testing the hypothesis $H_1$ versus $H_0$:

$$
\begin{aligned}
H_0 &: \; X_{l,i} \sim m_{0,i}, \quad l = 1, 2, \ldots, n \\
H_1 &: \; X_{l,i} \sim m_{1,i}, \quad l = 1, 2, \ldots, n
\end{aligned} \tag{1}
$$

In (1) $X_{l,i}$ denotes the $l$-th observation (sample), $n$ is the number of samples, and $m_{j,i}$ (for $j = 0, 1$), defined on the sample space $\Omega_i$ with σ-field $\mathcal{B}_i$, is the probability measure which governs the statistics of the i.i.d. observations of decision-maker $i$ under hypothesis $H_j$. It is assumed that the two decision-makers make independent observations, so that the probability measures $m_{0,1}$ and $m_{0,2}$ are mutually independent and so are $m_{1,1}$ and $m_{1,2}$.

The probability measures $m_{0,i}$, $m_{1,i}$ for the two detectors ($i = 1, 2$) are only known to belong to uncertainty classes $\mathcal{M}_{0,i}$ and $\mathcal{M}_{1,i}$, respectively, which are determined by the 2-alternating capacities $v_{0,i}$ and $v_{1,i}$ (defined below) as

$$
\mathcal{M}_{j,i} = \{ m_{j,i} \in \mathcal{M}_i \mid m_{j,i}(A) \le v_{j,i}(A), \; \forall A \in \mathcal{B}_i, \; m_{j,i}(\Omega_i) = v_{j,i}(\Omega_i) \}, \tag{2}
$$

where $\mathcal{M}_i$ is the class of measures on $(\Omega_i, \mathcal{B}_i)$ and $j = 0, 1$ for the two hypotheses.

The decision making of detectors 1 and 2 is coupled through the following cost structure:

$$
C(d_1, d_2; h) = \begin{cases} 0 & \text{for } d_1 = d_2 = h \\ e & \text{for } d_1 \neq d_2 \\ f & \text{for } d_1 = d_2 \neq h \end{cases} \tag{3}
$$

where $d_1, d_2, h \in \{0, 1\}$, $e$ and $f$ are non-negative constants, and we assume that $f > 2e$. Since the cost $[C(1,1;0) = C(0,0;1)]$ of wrong decisions by both detectors is expected to be considerably larger than the cost $[C(0,1;0) = C(1,0;0) = C(0,1;1) = C(1,0;1)]$ of a wrong decision by one of the detectors, this assumption does not impose a serious restriction on the generality of our problem formulation.
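As a small illustration of the cost structure (3), the following Python sketch encodes $C(d_1, d_2; h)$ and checks by enumeration that, when the two detectors decide independently and detector $i$ errs under $H_0$ with probability $a_i$, the expected cost is $e(a_1 + a_2) + (f - 2e)\,a_1 a_2$. The function names and the numerical values of `e` and `f` are illustrative assumptions, not part of the original formulation.

```python
from itertools import product


def cost(d1, d2, h, e=1.0, f=3.0):
    """Cost C(d1, d2; h) of (3); e and f are illustrative values with f > 2e."""
    if d1 == d2 == h:
        return 0.0          # both detectors correct
    if d1 != d2:
        return e            # exactly one detector wrong
    return f                # both detectors wrong (d1 == d2 != h)


def expected_cost_h0(a1, a2, e=1.0, f=3.0):
    """Expected cost under H0 when detector i decides 1 with probability a_i,
    independently of the other detector."""
    total = 0.0
    for d1, d2 in product((0, 1), repeat=2):
        p1 = a1 if d1 == 1 else 1.0 - a1
        p2 = a2 if d2 == 1 else 1.0 - a2
        total += p1 * p2 * cost(d1, d2, 0, e, f)
    return total
```

Because $f > 2e$, the coefficient $(f - 2e)$ of the product term is positive, so the expected cost is increasing in each error probability.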

Next we define the 2-alternating capacities:

Definition: A positive set function $v$ on a sample space $\Omega$ and associated σ-field $\mathcal{B}$ is called a 2-alternating capacity if it is increasing, continuous from below, continuous from above on closed sets, and satisfies the conditions $v(\emptyset) = 0$ and $v(A \cup B) + v(A \cap B) \le v(A) + v(B)$. Suppose now that $\mathcal{M}$ is the class of measures on $(\Omega, \mathcal{B})$ and $m \in \mathcal{M}$ is any such measure. Consider the uncertainty class which is determined by the 2-alternating capacity $v$ as follows [compare with (2)]:

$$
\mathcal{M}_v = \{ m \in \mathcal{M} \mid m(A) \le v(A), \; \forall A \in \mathcal{B}, \; m(\Omega) = v(\Omega) \}. \tag{4}
$$

When $\Omega$ is compact, several popular uncertainty models such as ε-contaminated neighborhoods [5], total variation neighborhoods [5], band classes [6] and p-point classes [7] are special cases of this model.

Example: The ε-contaminated model [5]

$$
\mathcal{M}_\epsilon = \{ m \in \mathcal{M} \mid m(A) = (1-\epsilon)\, m^0(A) + \epsilon\, m'(A), \; \forall A \in \mathcal{B}, \; m^0(\Omega) = m'(\Omega) \}, \tag{5}
$$

for $\epsilon \in [0, 1]$. Then $v(A) = (1-\epsilon)\, m^0(A) + \epsilon\, m^0(\Omega)$.

Fundamental properties of these uncertainty models have been studied by Huber and Strassen [9]. We will state the relevant properties as a Lemma.

Lemma 1: Suppose $v_0$ and $v_1$ are 2-alternating capacities on $(\Omega, \mathcal{B})$ and $\mathcal{M}_0$ and $\mathcal{M}_1$ are the uncertainty classes determined by them as in (4). Then there exists a Lebesgue-measurable function $\pi_v : \Omega \to [0, \infty]$ such that

$$
\theta\, v_0(\{\pi_v > \theta\}) + v_1(\{\pi_v \le \theta\}) \le \theta\, v_0(A) + v_1(A^c) \tag{6}
$$

for all $A \in \mathcal{B}$ and all $\theta \ge 0$. Furthermore there exist measures $(\hat m_0, \hat m_1)$ in $\mathcal{M}_0 \times \mathcal{M}_1$ such that

$$
\hat m_0(\{\pi_v > \theta\}) = v_0(\{\pi_v > \theta\}) \tag{7}
$$

$$
\hat m_1(\{\pi_v \le \theta\}) = v_1(\{\pi_v \le \theta\}) \tag{8}
$$

(that is, $\pi_v$ is stochastically largest over $\mathcal{M}_0$ under $\hat m_0$ and stochastically smallest over $\mathcal{M}_1$ under $\hat m_1$), and $\pi_v$ is a version of $d\hat m_1 / d\hat m_0$ and is unique a.e. $[\hat m_0]$. The measures $(\hat m_0, \hat m_1)$ are termed the least-favorable measures over $\mathcal{M}_0 \times \mathcal{M}_1$.


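Example: The ε-contaminated mixture uncertainty classes described by

$$
\mathcal{M}_j = \{ m_j \in \mathcal{M} \mid m_j = (1-\epsilon_j)\, m_j^0 + \epsilon_j\, m_j', \; m_j'(\Omega) = 1 \}, \quad j = 0, 1, \tag{9}
$$

associated with the 2-alternating capacities

$$
v_j(A) = \begin{cases} (1-\epsilon_j)\, m_j^0(A) + \epsilon_j, & A \neq \emptyset \\ 0, & A = \emptyset \end{cases} \tag{10}
$$

have the least-favorable distributions (with densities taken with respect to a dominating measure $\lambda$)

$$
\frac{d\hat m_0}{d\lambda} = \begin{cases} (1-\epsilon_0)\, \dfrac{dm_0^0}{d\lambda}, & \dfrac{dm_1^0}{dm_0^0} \le c_2 \\[2ex] \dfrac{1-\epsilon_0}{c_2}\, \dfrac{dm_1^0}{d\lambda}, & c_2 < \dfrac{dm_1^0}{dm_0^0} \end{cases} \tag{11}
$$

$$
\frac{d\hat m_1}{d\lambda} = \begin{cases} (1-\epsilon_1)\, \dfrac{dm_1^0}{d\lambda}, & c_1 < \dfrac{dm_1^0}{dm_0^0} \\[2ex] c_1 (1-\epsilon_1)\, \dfrac{dm_0^0}{d\lambda}, & \dfrac{dm_1^0}{dm_0^0} \le c_1 \end{cases} \tag{12}
$$

and the Huber-Strassen derivative $\pi_v$

$$
\pi_v = \frac{d\hat m_1}{d\hat m_0} = \frac{1-\epsilon_1}{1-\epsilon_0}\, \min\left\{ c_2, \; \max\left( c_1, \; \frac{dm_1^0}{dm_0^0} \right) \right\} \tag{13}
$$

where $0 \le c_1 < c_2 \le \infty$ are such that $\hat m_1(\Omega) = \hat m_0(\Omega) = 1$.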
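The clipped form of the Huber-Strassen derivative for the ε-contaminated classes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the nominal measures are taken to be unit-variance Gaussians with means 0 and `mu` (a choice made here, not in the source), and the clipping constants `c1` and `c2` are passed in as given numbers rather than solved for from the normalization conditions on the least-favorable measures.

```python
import math


def nominal_lr(x, mu=1.0):
    """Nominal likelihood ratio dm1^0/dm0^0 for N(mu, 1) versus N(0, 1)
    (an assumed nominal pair, used only for illustration)."""
    return math.exp(mu * x - 0.5 * mu * mu)


def huber_strassen_derivative(x, c1, c2, eps0, eps1, mu=1.0):
    """Clipped likelihood ratio: the nominal ratio is limited to [c1, c2]
    and scaled by (1 - eps1) / (1 - eps0).
    c1 and c2 are hypothetical inputs, not derived from the normalization."""
    scale = (1.0 - eps1) / (1.0 - eps0)
    return scale * min(c2, max(c1, nominal_lr(x, mu)))
```

Large observations that would dominate an unclipped likelihood ratio contribute at most the upper clipping level, which is what limits the influence of contamination on the test statistic.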


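Let us now return to the hypothesis testing problem (1). Assuming that the a priori probabilities for the hypotheses $H_0$ and $H_1$ are $\lambda$ and $1-\lambda$, respectively, and that likelihood ratio tests are employed, the average cost is

$$
\begin{aligned}
J(L_1^{(n)}, L_2^{(n)}, \eta_1, \eta_2) = {} & \lambda \Big\{ e \big[ m_{0,1}^{(n)}(\{L_1^{(n)}(\underline{X}_1) > \eta_1\}) + m_{0,2}^{(n)}(\{L_2^{(n)}(\underline{X}_2) > \eta_2\}) \big] \\
& \quad + (f - 2e)\, m_{0,1}^{(n)}(\{L_1^{(n)}(\underline{X}_1) > \eta_1\})\, m_{0,2}^{(n)}(\{L_2^{(n)}(\underline{X}_2) > \eta_2\}) \Big\} \\
& + (1-\lambda) \Big\{ e \big[ m_{1,1}^{(n)}(\{L_1^{(n)}(\underline{X}_1) < \eta_1\}) + m_{1,2}^{(n)}(\{L_2^{(n)}(\underline{X}_2) < \eta_2\}) \big] \\
& \quad + (f - 2e)\, m_{1,1}^{(n)}(\{L_1^{(n)}(\underline{X}_1) < \eta_1\})\, m_{1,2}^{(n)}(\{L_2^{(n)}(\underline{X}_2) < \eta_2\}) \Big\}
\end{aligned} \tag{14}
$$

In (14) $m_{j,i}^{(n)}$ are the $n$-th order extensions of the probability measures $m_{j,i}$ and characterize the observations $\underline{X}_i = (X_{1,i}, X_{2,i}, \ldots, X_{n,i})$ of the $i$-th decision-maker ($i = 1, 2$) under hypothesis $H_j$ ($j = 0, 1$). By

$$
L_i^{(n)}(\underline{X}_i) = \frac{dm_{1,i}^{(n)}}{dm_{0,i}^{(n)}}(\underline{X}_i) = \prod_{l=1}^{n} \frac{dm_{1,i}}{dm_{0,i}}(X_{l,i})
$$

we denote the likelihood ratio based on $\underline{X}_i$ of the $i$-th decision-maker and by $\eta_i$ its threshold.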

The optimal thresholds for (14) are the pair $(\eta_1, \eta_2)$ which minimizes the average cost function $J(L_1^{(n)}, L_2^{(n)}, \tilde\eta_1, \tilde\eta_2)$, that is

$$
(\eta_1, \eta_2) = \arg \min_{\tilde\eta_1, \tilde\eta_2} J(L_1^{(n)}, L_2^{(n)}, \tilde\eta_1, \tilde\eta_2) \tag{15}
$$

Actually the likelihood ratio tests (LRTs) are the optimal policies for the two-decision-maker problem formulated above, as stated in the following proposition.

Proposition 1: Likelihood ratio tests (LRTs) with thresholds which minimize $J(L_1^{(n)}, L_2^{(n)}, \tilde\eta_1, \tilde\eta_2)$ of (14) are optimal over all tests for the aforementioned common cost structure.

Proof: The proof follows closely the corresponding proof of [1] about the optimality of the one-detector strategy (i.e., the likelihood ratio test) in this case of decision makers with independent observations, and will be omitted.

II.B Robust Distributed Block Detection

The expression for the average cost function in (14) is valid for the case that there is no uncertainty in the statistics of the observations of the two decision makers. In the presence of uncertainty within the 2-alternating classes $\mathcal{M}_{j,i}$ of (2), the likelihood ratios $\hat L_i^{(n)}$ and the thresholds $\hat\eta_i$, $i = 1, 2$, which are matched to the least-favorable measures $\hat m_{j,i}$ (singled out by Lemma 1) of the classes $\mathcal{M}_{j,i}$ are employed. In this case the average cost function under mismatch (that is, when the statistics of the observations are actually governed by $m_{j,i} \in \mathcal{M}_{j,i}$) is given by $J(\hat L_1^{(n)}, \hat L_2^{(n)}, \hat\eta_1, \hat\eta_2)$, which is obtained from (14) if we replace $L_i^{(n)}$ by $\hat L_i^{(n)}$ and $\eta_i$ by $\hat\eta_i$, for $i = 1, 2$. These thresholds are the solution to the minimization problem

$$
(\hat\eta_1, \hat\eta_2) = \arg \min_{\tilde\eta_1, \tilde\eta_2} \hat J(\hat L_1^{(n)}, \hat L_2^{(n)}, \tilde\eta_1, \tilde\eta_2), \tag{16}
$$

where $\hat J(\hat L_1^{(n)}, \hat L_2^{(n)}, \tilde\eta_1, \tilde\eta_2)$ is the average cost when the likelihood ratios $\hat L_i^{(n)}$ ($i = 1, 2$ for the two detectors) are employed and the observations are distributed according to $\hat m_{j,i}$ ($j = 0, 1$ for the two hypotheses).

Lemma 1 provides the robust test and the least-favorable distributions for the one-dimensional (single observation) case and a single detector. For the case of $n$ independent identically distributed (i.i.d.) observations, we denote by $m_j^{(n)}$, $j = 0, 1$, the measures on $(\Omega^n, \mathcal{B}^n)$ which are the $n$-th order extensions of the measures $m_j \in \mathcal{M}_j$ of the classes defined in (4), and by $\hat m_j^{(n)}$, $j = 0, 1$, the $n$-th order extensions of the measures $\hat m_j$ singled out by Lemma 1. Then, the following result holds.


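Lemma 2: For any threshold $\eta > 0$ and any decision statistic $g^{(n)}$:

$$
m_0^{(n)}(\{\hat L^{(n)}(\underline{X}) > \eta\}) \le \hat m_0^{(n)}(\{\hat L^{(n)}(\underline{X}) > \eta\}) \le \hat m_0^{(n)}(\{g^{(n)}(\underline{X}) > \eta\}) \tag{17}
$$

$$
m_1^{(n)}(\{\hat L^{(n)}(\underline{X}) < \eta\}) \le \hat m_1^{(n)}(\{\hat L^{(n)}(\underline{X}) < \eta\}) \le \hat m_1^{(n)}(\{g^{(n)}(\underline{X}) < \eta\}) \tag{18}
$$

where

$$
\hat L^{(n)}(\underline{X}) = \frac{d\hat m_1^{(n)}}{d\hat m_0^{(n)}}(\underline{X}) = \prod_{l=1}^{n} \frac{d\hat m_1}{d\hat m_0}(X_l)
$$

is the likelihood ratio, $X_l \in \Omega$ is the $l$-th observation and $\underline{X} = (X_1, X_2, \ldots, X_n) \in \Omega^n$. Equations (17) and (18) imply that the test based on $\hat L^{(n)}$ is minimax robust.

Proof: For i.i.d. observations this Lemma was first proved in [6, Section 4] and [9]; very recently a more straightforward proof was given in [11].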


For the case of two detectors and uncertainty within 2-alternating capacity classes the following result holds:

Proposition 2: The LRTs based on the least-favorable pairs of distributions $(\hat m_{0,i}, \hat m_{1,i})$ in the classes $(\mathcal{M}_{0,i}, \mathcal{M}_{1,i})$, $i = 1, 2$ (for the two detectors), are minimax robust with respect to the average cost function defined in (14), that is

$$
J(\hat L_1^{(n)}, \hat L_2^{(n)}, \hat\eta_1, \hat\eta_2) \le \hat J(\hat L_1^{(n)}, \hat L_2^{(n)}, \hat\eta_1, \hat\eta_2) \le \hat J(g_1^{(n)}, g_2^{(n)}, \eta_1, \eta_2) \tag{19}
$$

where $g_i^{(n)}$ ($i = 1, 2$) is any decision statistic operating on the observations $\underline{X}_i$.

Proof: The right-hand-side inequality in (19) is a straightforward application of Proposition 1 to the case characterized by $\hat m_{j,i}$ ($j = 0, 1$ and $i = 1, 2$), for which the likelihood ratios are $\hat L_i^{(n)}$ and the optimal thresholds are $\hat\eta_i$.

The left-hand-side inequality in (19) is a consequence of Lemma 2. Specifically, we apply the left-hand-side inequalities in (17) and (18) to the probability measures $(m_{0,i}, \hat m_{0,i})$ and $(m_{1,i}, \hat m_{1,i})$, respectively, of the two detectors ($i = 1, 2$), and then use the definitions of the mismatch average cost function $J$ and of the average cost function $\hat J$ matched to the least-favorable pair of probability measures $(\hat m_{0,i}, \hat m_{1,i})$. Since $f > 2e$, the cost in (14) is nondecreasing in each of the four error probabilities, so the inequalities on the individual error probabilities carry over to the cost.

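Note: The optimal thresholds $(\hat\eta_1, \hat\eta_2)$ can be determined from the error probabilities $\hat\alpha_i$, $\hat\beta_i$ ($i = 1, 2$) for the least-favorable case of problem (1) by minimizing

$$
\min \Big\{ \lambda \big[ e(\hat\alpha_1 + \hat\alpha_2) + (f - 2e)\, \hat\alpha_1 \hat\alpha_2 \big] + (1-\lambda) \big[ e(\hat\beta_1 + \hat\beta_2) + (f - 2e)\, \hat\beta_1 \hat\beta_2 \big] \Big\}
$$

under the constraints $\hat\beta_i = f_i(\hat\alpha_i)$ [receiver operating characteristic (ROC) for detector $i$], $0 \le \hat\alpha_i \le 1$, $0 \le \hat\beta_i \le 1$, and $\hat\alpha_i + \hat\beta_i \le 1$ for $i = 1, 2$.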
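The constrained minimization in the Note can be explored numerically. The sketch below is illustrative only: it assumes a Gaussian mean-shift ROC for each detector, $\hat\beta = \Phi(\Phi^{-1}(1 - \hat\alpha) - d)$ with a deflection parameter `d` (an assumption made here in place of the least-favorable ROC, which the source does not give in closed form), and it replaces an analytical solution with a plain grid search over the two false-alarm probabilities. All names and default values are hypothetical.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution


def roc_beta(alpha, d):
    """Miss probability for false-alarm probability alpha under an assumed
    Gaussian mean-shift ROC with deflection d >= 0."""
    return _STD.cdf(_STD.inv_cdf(1.0 - alpha) - d)


def joint_cost(a1, a2, d1, d2, lam=0.5, e=1.0, f=3.0):
    """Coupled cost of the Note for false-alarm probabilities a1, a2."""
    b1, b2 = roc_beta(a1, d1), roc_beta(a2, d2)
    under_h0 = e * (a1 + a2) + (f - 2.0 * e) * a1 * a2
    under_h1 = e * (b1 + b2) + (f - 2.0 * e) * b1 * b2
    return lam * under_h0 + (1.0 - lam) * under_h1


def grid_search(d1, d2, steps=100, **kw):
    """Return (cost, a1, a2) minimizing joint_cost over a midpoint grid in (0, 1)."""
    grid = [(k + 0.5) / steps for k in range(steps)]
    return min((joint_cost(a1, a2, d1, d2, **kw), a1, a2)
               for a1 in grid for a2 in grid)
```

The product terms make the cost non-separable in the two detectors, which is why the search runs over the pair $(\hat\alpha_1, \hat\alpha_2)$ jointly rather than over each detector on its own. For this ROC the constraint $\hat\alpha_i + \hat\beta_i \le 1$ holds automatically when $d \ge 0$.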

II.C  Asymptotic  Performance 

We will need the following two Lemmas, which are concerned with the Chernoff upper bounds on the error probabilities of hypothesis testing problems in the presence of uncertainty within 2-alternating capacity classes:

Lemma 3: Suppose that in the presence of uncertainty about the statistics of the aforementioned i.i.d. observations $\underline{X} = (X_1, X_2, \ldots, X_n)$ we employ a likelihood ratio test based on $\hat L^{(n)}$ defined above. Then the error probabilities of the hypothesis testing problem of $H_1$ versus $H_0$ can be upper-bounded by the Chernoff bounds:

$$
m_0^{(n)}\{\hat L^{(n)}(\underline{X}) > n\gamma\} \le \exp\{-n\,[s\gamma + C_0(s, \hat L)]\} \tag{20}
$$

$$
m_1^{(n)}\{\hat L^{(n)}(\underline{X}) < n\gamma\} \le \exp\{-n\,[-s\gamma + C_1(s, \hat L)]\} \tag{21}
$$

where $\hat\eta = n\gamma$ is the threshold, $\hat L = d\hat m_1 / d\hat m_0$, and for all $s \in (0, 1)$ the Chernoff distances $C_j(s, \hat L)$ are given by

$$
C_0(s, \hat L) = -\ln E_0\{\hat L^{s}\} \tag{22}
$$

$$
C_1(s, \hat L) = -\ln E_1\{\hat L^{-s}\}. \tag{23}
$$

In (22)-(23) the expectations are with respect to the probability measures $m_0$ and $m_1$, respectively.

Proof:  See  [10]. 
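A numerical sanity check of a bound of the form (20) can be made in the simplest matched case. The Python sketch below assumes unit-variance Gaussian observations with mean 0 under $H_0$ and mean `mu` under $H_1$ (an illustrative choice, not from the source), and it assumes that the threshold $n\gamma$ is applied to the log-likelihood ratio. Under these assumptions $E_0\{L^s\} = \exp\{s(s-1)\mu^2/2\}$, so $C_0(s, L) = s(1-s)\mu^2/2$, and the exact false-alarm probability is available in closed form for comparison.

```python
import math
from statistics import NormalDist


def chernoff_c0(s, mu):
    """C_0(s, L) = -ln E_0{L^s} for N(mu, 1) versus N(0, 1)."""
    return 0.5 * s * (1.0 - s) * mu * mu


def chernoff_bound_h0(n, gamma, s, mu):
    """Right-hand side of the Chernoff bound: exp{-n [s*gamma + C_0(s, L)]}."""
    return math.exp(-n * (s * gamma + chernoff_c0(s, mu)))


def exact_false_alarm(n, gamma, mu):
    """P_0(log-likelihood ratio of n samples > n*gamma); under H0 the
    log-likelihood ratio is N(-n*mu^2/2, n*mu^2)."""
    mean = -0.5 * n * mu * mu
    std = mu * math.sqrt(n)
    return 1.0 - NormalDist(mean, std).cdf(n * gamma)
```

For fixed `gamma` and `s` with $s\gamma + C_0 > 0$, the bound decays geometrically in `n`, which is the behavior exploited in the asymptotic results.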

Lemma 4: As the number of observations increases, the Chernoff bounds of (20)-(21) converge exponentially to zero for all probability measures $m_j$, $j = 0, 1$, belonging to uncertainty classes of the form (4).

Proof:  See  [10]. 

Note:  Lemmas  3  and  4  are  also  valid  for  discrete-time  stationary  Gaussian  observations 
with  spectral  uncertainty  determined  by  2-alternating  capacity  classes;  see  [10]  for  details. 

The following proposition provides the desired asymptotic result for the mismatch average cost function $J(\hat L_1^{(n)}, \hat L_2^{(n)}, \hat\eta_1, \hat\eta_2)$ as the number of observations $n$ increases:

Proposition 3: Under the assumptions of Proposition 2, the average cost function under mismatch converges to zero exponentially as the number of observations increases, despite the uncertainty; that is, $J(\hat L_1^{(n)}, \hat L_2^{(n)}, \hat\eta_1, \hat\eta_2) \to 0$ as $n \to \infty$ for all probability measures $m_{j,i}$ in the uncertainty class $\mathcal{M}_{j,i}$ given by (2).

Proof: By applying Lemma 3 to the error probabilities of the hypothesis testing problem of each of the two detectors and using the definition of $J(\hat L_1^{(n)}, \hat L_2^{(n)}, \hat\eta_1, \hat\eta_2)$ we derive an upper bound on the average cost under mismatch in terms of the Chernoff bounds. This takes the form

$$
\begin{aligned}
J(\hat L_1^{(n)}, \hat L_2^{(n)}, \hat\eta_1, \hat\eta_2) \le {} & \lambda \Big\{ e \big[ \exp\{-n[s\gamma_1 + C_{0,1}(s, \hat L_1)]\} + \exp\{-n[s\gamma_2 + C_{0,2}(s, \hat L_2)]\} \big] \\
& \quad + (f - 2e) \exp\{-n[s\gamma_1 + C_{0,1}(s, \hat L_1)]\} \exp\{-n[s\gamma_2 + C_{0,2}(s, \hat L_2)]\} \Big\} \\
& + (1-\lambda) \Big\{ e \big[ \exp\{-n[-s\gamma_1 + C_{1,1}(s, \hat L_1)]\} + \exp\{-n[-s\gamma_2 + C_{1,2}(s, \hat L_2)]\} \big] \\
& \quad + (f - 2e) \exp\{-n[-s\gamma_1 + C_{1,1}(s, \hat L_1)]\} \exp\{-n[-s\gamma_2 + C_{1,2}(s, \hat L_2)]\} \Big\}
\end{aligned} \tag{24}
$$

where $\hat\eta_i = n\gamma_i$, for $i = 1, 2$, is the threshold for the $i$-th detector, $\hat L_i = d\hat m_{1,i} / d\hat m_{0,i}$, and for all $s \in (0, 1)$ the Chernoff distances $C_{j,i}(s, \hat L_i)$ for $j = 0, 1$ are given by $C_{0,i}(s, \hat L_i) = -\ln E_{0,i}\{\hat L_i^{s}\}$ and $C_{1,i}(s, \hat L_i) = -\ln E_{1,i}\{\hat L_i^{-s}\}$, where the expectations are with respect to the probability measures $m_{0,i}$ and $m_{1,i}$, respectively. Finally we apply Lemma 4 to (24) to complete the proof of Proposition 3.
III. MINIMAX ROBUST DISTRIBUTED SEQUENTIAL DETECTION

III.A Problem Formulation

The distributed sequential detection problem that we consider in this section has many similarities with the problem considered in the previous section. The two decision makers are faced with the same hypothesis testing problem described in (1), where the uncertainty classes of (2) and the cost function of (3) remain the same. However, now there is also a cost for collecting data, which for the $i$-th decision maker ($i = 1, 2$) is defined by

$$
k_i \big[ \lambda\, E_{0,i}\{N_i\} + (1-\lambda)\, E_{1,i}\{N_i\} \big], \tag{25}
$$

where $k_i$ ($i = 1, 2$) are nonnegative constants, $E_{j,i}$ denotes expectation with respect to the probability measure $m_{j,i}$ (under the hypothesis $H_j$, $j = 0, 1$, and for the $i$-th detector, $i = 1, 2$), the a priori probabilities for the hypotheses $H_0$ and $H_1$ are $\lambda$ and $1-\lambda$, respectively, and the random variable $N_i$ is the (discrete) stopping time (sample size) of the $i$-th detector, i.e., the number of samples necessary in order to reach a decision in favor of one of the two hypotheses.

Recall [12] that in the sequential detection of a single detector, the optimal test, termed the sequential probability ratio test (SPRT), consists of continuing to sample until the likelihood ratio $L^{(n)}$ based on $n$ samples of the observations exceeds $B$ or falls below $A$ (the two thresholds), in which case a decision is made in favor of $H_1$ or $H_0$, respectively.
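The stopping rule of the SPRT can be sketched as a short Python function. This is a minimal illustration, not the paper's procedure for the distributed problem: the function name, the `max_samples` truncation and the fallback decision at truncation are assumptions added here so that the loop always terminates, and the test is run on the log-likelihood ratio for numerical convenience.

```python
import math
import random


def sprt(sample, log_lr, A, B, max_samples=10_000):
    """Run one SPRT with thresholds 0 < A < 1 < B.

    sample  : callable returning the next observation
    log_lr  : callable mapping one observation to its log-likelihood ratio
    Returns (decision, samples_used); decision is 1 for H1 and 0 for H0.
    """
    log_a, log_b = math.log(A), math.log(B)
    llr = 0.0
    for n in range(1, max_samples + 1):
        llr += log_lr(sample())
        if llr >= log_b:
            return 1, n      # likelihood ratio reached B: decide H1
        if llr <= log_a:
            return 0, n      # likelihood ratio fell to A: decide H0
    # Truncation is an assumption of this sketch, not part of the SPRT.
    return (1 if llr > 0.0 else 0), max_samples
```

As a usage example under an assumed Gaussian mean-shift model with unit variance and mean 1 under $H_1$, one could call `sprt(lambda: rng.gauss(1.0, 1.0), lambda x: x - 0.5, 0.01, 100.0)` with `rng = random.Random(0)`; the number of samples used then varies from run to run, which is the random sample size $N$ that enters the sequential cost.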

Assuming that SPRTs are employed by both detectors, we can write the average cost as

$$
\begin{aligned}
J(L_1^{(N_1)}, L_2^{(N_2)}, A_1, B_1, A_2, B_2) = {} & \lambda \Big\{ k_1 E_{0,1}\{N_1 \mid L_1\} + k_2 E_{0,2}\{N_2 \mid L_2\} \\
& \quad + e \big[ m_{0,1}^{(N_1)}(\{L_1^{(N_1)}(\underline{X}_1) > B_1\}) + m_{0,2}^{(N_2)}(\{L_2^{(N_2)}(\underline{X}_2) > B_2\}) \big] \\
& \quad + (f - 2e)\, m_{0,1}^{(N_1)}(\{L_1^{(N_1)}(\underline{X}_1) > B_1\})\, m_{0,2}^{(N_2)}(\{L_2^{(N_2)}(\underline{X}_2) > B_2\}) \Big\} \\
& + (1-\lambda) \Big\{ k_1 E_{1,1}\{N_1 \mid L_1\} + k_2 E_{1,2}\{N_2 \mid L_2\} \\
& \quad + e \big[ m_{1,1}^{(N_1)}(\{L_1^{(N_1)}(\underline{X}_1) < A_1\}) + m_{1,2}^{(N_2)}(\{L_2^{(N_2)}(\underline{X}_2) < A_2\}) \big] \\
& \quad + (f - 2e)\, m_{1,1}^{(N_1)}(\{L_1^{(N_1)}(\underline{X}_1) < A_1\})\, m_{1,2}^{(N_2)}(\{L_2^{(N_2)}(\underline{X}_2) < A_2\}) \Big\}
\end{aligned} \tag{26}
$$

where $N_i$ ($i = 1, 2$) are (discrete) stopping times for the two detectors; that is, if $L_i^{(n)}(\underline{X}_i)$, which is based on the $n$ observations $\underline{X}_i$, is larger than or equal to $B_i$, it is decided that $H_1$ is true, the test terminates and $N_i = n$; if it is smaller than or equal to $A_i$, it is decided that $H_0$ is true, the test again terminates and $N_i = n$; otherwise, one more sample (observation) is collected and the procedure continues. $m_{j,i}^{(N_i)}$ is the probability measure which governs the observations of the $i$-th detector under hypothesis $H_j$ ($j = 0, 1$) when the SPRT terminates after $N_i$ samples, $L_i^{(N_i)}(\underline{X}_i) = \prod_{l=1}^{N_i} L_i(X_{l,i})$ is the likelihood ratio of the $i$-th detector based on the $N_i$ samples of the i.i.d. observations $\underline{X}_i = (X_{1,i}, X_{2,i}, \ldots, X_{N_i,i})$, $L_i(X_{l,i}) = \frac{dm_{1,i}}{dm_{0,i}}(X_{l,i})$ is the likelihood ratio for one sample, and $0 < A_i < 1 < B_i$ are the two thresholds for the SPRT of detector $i$. The notation $E_{j,i}\{N_i \mid L_i\}$ has been preferred over the notation $E_{j,i}\{N_i\}$ for the expected value of $N_i$ under probability measure $m_{j,i}$ and an SPRT employing the likelihood ratio $L_i^{(n)} = dm_{1,i}^{(n)} / dm_{0,i}^{(n)}$, because it allows us to consider situations of mismatch, that is, situations in which the likelihood ratio employed is not the one corresponding to the operating probability measures.

The optimal thresholds for (26) are the quadruple $(A_1, B_1, A_2, B_2)$ which minimizes the average cost function $J(L_1^{(N_1)}, L_2^{(N_2)}, \tilde A_1, \tilde B_1, \tilde A_2, \tilde B_2)$, that is

$$
(A_1, B_1, A_2, B_2) = \arg \min_{\tilde A_1, \tilde B_1, \tilde A_2, \tilde B_2} J(L_1^{(N_1)}, L_2^{(N_2)}, \tilde A_1, \tilde B_1, \tilde A_2, \tilde B_2) \tag{27}
$$

Actually the sequential probability ratio tests (SPRTs) are the optimal policies for the two-decision-maker problem formulated above, as stated in the following proposition.

Proposition 4: SPRTs with thresholds which minimize $J(L_1^{(N_1)}, L_2^{(N_2)}, \tilde A_1, \tilde B_1, \tilde A_2, \tilde B_2)$ of (26) are optimal over all tests for the aforementioned common cost structure.

Proof: The proof is provided in [2] for discrete-time sequential detection and in [3] for continuous-time sequential detection; it establishes the optimality of the one-detector strategy (i.e., the SPRT) in this case of decision makers with independent observations. It will be omitted.

III.B Robust Distributed Sequential Detection

The expression for the average cost function in (26) is valid for the case that there is no uncertainty in the statistics of the observations of the two decision makers. In the presence of uncertainty within the 2-alternating classes $\mathcal{M}_{j,i}$ of (2), the likelihood ratios $\hat L_i^{(n)}$ and the thresholds $(\hat A_i, \hat B_i)$, $i = 1, 2$, which are matched to the least-favorable measures $\hat m_{j,i}$ (singled out by Lemma 1) of the classes $\mathcal{M}_{j,i}$ are employed. In this case the average cost function under mismatch (that is, when the statistics of the observations are actually governed by $m_{j,i} \in \mathcal{M}_{j,i}$) is given by $J(\hat L_1^{(N_1)}, \hat L_2^{(N_2)}, \hat A_1, \hat B_1, \hat A_2, \hat B_2)$, which is obtained from (26) if we replace $L_i^{(n)}$ by $\hat L_i^{(n)}$ and $(A_i, B_i)$ by $(\hat A_i, \hat B_i)$, for $i = 1, 2$. These thresholds are the solution to the minimization problem

$$
(\hat A_1, \hat B_1, \hat A_2, \hat B_2) = \arg \min_{\tilde A_1, \tilde B_1, \tilde A_2, \tilde B_2} \hat J(\hat L_1^{(N_1)}, \hat L_2^{(N_2)}, \tilde A_1, \tilde B_1, \tilde A_2, \tilde B_2), \tag{28}
$$

where $\hat J(\hat L_1^{(N_1)}, \hat L_2^{(N_2)}, \tilde A_1, \tilde B_1, \tilde A_2, \tilde B_2)$ is the average cost when SPRTs based on the likelihood ratios $\hat L_i^{(n)}$ and the thresholds $(\tilde A_i, \tilde B_i)$ ($i = 1, 2$ for the two detectors) are employed and the observations are distributed according to $\hat m_{j,i}$ ($j = 0, 1$ for the two hypotheses).

For sequential detection the following result also holds:

Lemma 5: For i.i.d. observations with probability measures belonging to uncertainty classes of the form (4), the sequential probability ratio test (SPRT) based on the likelihood ratio $\hat L^{(n)}(\underline{X})$ defined in Lemma 2 above and on the thresholds $A$ and $B$ ($A < 1 < B$) is minimax robust for the error probabilities; that is, it satisfies the inequalities

$$
m_0^{(N)}(\{\hat L^{(N)}(\underline{X}) > B\}) \le \hat m_0^{(N)}(\{\hat L^{(N)}(\underline{X}) > B\}) \le \hat m_0^{(N)}(\{g^{(N)}(\underline{X}) > B\}) \tag{29}
$$

$$
m_1^{(N)}(\{\hat L^{(N)}(\underline{X}) < A\}) \le \hat m_1^{(N)}(\{\hat L^{(N)}(\underline{X}) < A\}) \le \hat m_1^{(N)}(\{g^{(N)}(\underline{X}) < A\}) \tag{30}
$$

In (29)-(30) $N$ is a stopping time for the SPRT. The measures $m_j^{(N)}$ and $\hat m_j^{(N)}$ for $j = 0, 1$ are the multi-dimensional extensions of the original measures $m_j$ and $\hat m_j$, respectively, which are induced by the stopping time $N$ defined above. Finally, $g^{(n)}$ is any other decision statistic, based on the $n$ observations $\underline{X}$, which could be used in the aforementioned sequential test instead of the likelihood ratio.

Proof:  See  section  5  of  [4]. 

Before stating and proving the basic result of this section we need to prove the following Lemma about the expected values of the sample sizes (stopping times) of the SPRTs under mismatch:

Lemma 6: Let $E_{j,i}\{N_i \mid \hat L_i\}$ represent the expected sample size of the SPRT of the $i$-th detector under hypothesis $H_j$ based on the likelihood ratio $\hat L_i^{(n)}(\underline{X}_i) = \prod_{l=1}^{n} \hat L_i(X_{l,i})$, where $\hat L_i(X_{l,i}) = \frac{d\hat m_{1,i}}{d\hat m_{0,i}}(X_{l,i})$, and on the thresholds $(\hat A_i, \hat B_i)$, when the i.i.d. observations are distributed according to the probability measure $m_{j,i}$. Then

$$
E_{0,i}\{N_i \mid \hat L_i\} = \frac{\omega(\alpha_i, \hat\beta_i, \hat\alpha_i)}{E_{0,i}\{-\ln \hat L_i\}} \tag{31}
$$

$$
E_{1,i}\{N_i \mid \hat L_i\} = \frac{\omega(\beta_i, \hat\alpha_i, \hat\beta_i)}{E_{1,i}\{\ln \hat L_i\}}, \tag{32}
$$

where

$$
\omega(x, y, z) = (1-x) \ln\frac{1-z}{y} + x \ln\frac{z}{1-y}, \tag{33}
$$

$\alpha_i$, $\beta_i$ are the mismatch error probabilities for the $i$-th detector under hypotheses $H_0$ and $H_1$, respectively, given by

$$
\alpha_i = m_{0,i}^{(N_i)}(\{\hat L_i^{(N_i)} > \hat B_i\}) \tag{34}
$$

and

$$
\beta_i = m_{1,i}^{(N_i)}(\{\hat L_i^{(N_i)} < \hat A_i\}), \tag{35}
$$

while $\hat\alpha_i$ and $\hat\beta_i$ are the corresponding matched error probabilities, which are given by

$$
\hat\alpha_i = \hat m_{0,i}^{(N_i)}(\{\hat L_i^{(N_i)} > \hat B_i\}) \tag{36}
$$

and

$$
\hat\beta_i = \hat m_{1,i}^{(N_i)}(\{\hat L_i^{(N_i)} < \hat A_i\}). \tag{37}
$$

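Proof: We prove only (31); (32) can be proved in a similar way. We write two different expressions for $E_{0,i}\{\ln \hat L_i^{(N_i)}(\underline{X}_i)\}$:

$$
\begin{aligned}
E_{0,i}\{\ln \hat L_i^{(N_i)}(\underline{X}_i)\} &= E_{0,i}\Big\{ \sum_{l=1}^{N_i} \ln \hat L_i(X_{l,i}) \Big\} \\
&= E_{0,i}\Big\{ \tilde E_{0,i}\Big\{ \sum_{l=1}^{N_i} \ln \hat L_i(X_{l,i}) \,\Big|\, N_i \Big\} \Big\} \\
&= E_{0,i}\{N_i \mid \hat L_i\}\, E_{0,i}\{\ln \hat L_i\}
\end{aligned} \tag{38}
$$

[where $\tilde E_{0,i}$ denotes expectation with respect to the measure $m_{0,i}$ governing the observations, whereas the outer $E_{0,i}$ denotes expectation with respect to the measure induced by the stopping-time variable $N_i$; since the single-sample expectation of $\ln \hat L_i$ is a constant, it can be pulled out of the expectation over $N_i$ in the step preceding the last one in (38)]; and

$$
\begin{aligned}
E_{0,i}\{\ln \hat L_i^{(N_i)}(\underline{X}_i)\} &\approx \alpha_i \ln \hat B_i + (1-\alpha_i) \ln \hat A_i \\
&\approx \alpha_i \ln\frac{1-\hat\beta_i}{\hat\alpha_i} + (1-\alpha_i) \ln\frac{\hat\beta_i}{1-\hat\alpha_i} \\
&= -\omega(\alpha_i, \hat\beta_i, \hat\alpha_i).
\end{aligned} \tag{39}
$$

In deriving (39) we used the definition of the SPRT, the definitions (34)-(35), and Wald's approximations

$$
\hat B_i \approx \frac{1-\hat\beta_i}{\hat\alpha_i} \tag{40}
$$

and

$$
\hat A_i \approx \frac{\hat\beta_i}{1-\hat\alpha_i}, \tag{41}
$$

and neglected the overshoot phenomena [12]. Then (31) follows from (38) and (39).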
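The quantities appearing in this proof are easy to evaluate numerically. The Python sketch below computes Wald's approximate thresholds of (40)-(41) from target error probabilities and an expected sample size of the form (31), using $\omega(x, y, z) = (1-x)\ln\frac{1-z}{y} + x\ln\frac{z}{1-y}$, which is the form consistent with (39). The function names are hypothetical, overshoot is neglected as in the proof, and the single-sample mean of $-\ln \hat L_i$ under $H_0$ must be supplied by the caller.

```python
import math


def wald_thresholds(alpha_hat, beta_hat):
    """Wald's approximations: A ~ beta/(1 - alpha), B ~ (1 - beta)/alpha."""
    return beta_hat / (1.0 - alpha_hat), (1.0 - beta_hat) / alpha_hat


def omega(x, y, z):
    """omega(x, y, z) = (1 - x) ln((1 - z)/y) + x ln(z/(1 - y))."""
    return (1.0 - x) * math.log((1.0 - z) / y) + x * math.log(z / (1.0 - y))


def expected_samples_h0(alpha, alpha_hat, beta_hat, mean_neg_log_lr):
    """Approximate expected sample size under H0 (overshoot neglected):
    omega(alpha, beta_hat, alpha_hat) / E_0{-ln L} for one sample."""
    return omega(alpha, beta_hat, alpha_hat) / mean_neg_log_lr
```

For instance, in a matched Gaussian mean-shift case with unit variance and unit shift (an assumption for illustration), the single-sample mean of $-\ln L$ under $H_0$ is 0.5, and `expected_samples_h0(0.01, 0.01, 0.01, 0.5)` evaluates to about 9 samples.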
For the case of uncertainty within 2-alternating capacity classes the following result holds:

Proposition 5: The SPRTs which employ thresholds $(\hat A_i, \hat B_i)$ and a likelihood ratio $\hat L_i^{(n)}$, defined as in Lemma 6, which is based on the least-favorable pairs of distributions $(\hat m_{0,i}, \hat m_{1,i})$ in the classes $(\mathcal{M}_{0,i}, \mathcal{M}_{1,i})$, $i = 1, 2$ (for the two detectors), are minimax robust with respect to the average cost function defined in (26), that is

$$
J(\hat L_1^{(N_1)}, \hat L_2^{(N_2)}, \hat A_1, \hat B_1, \hat A_2, \hat B_2) \le \hat J(\hat L_1^{(N_1)}, \hat L_2^{(N_2)}, \hat A_1, \hat B_1, \hat A_2, \hat B_2) \le \hat J(g_1^{(N_1)}, g_2^{(N_2)}, A_1, B_1, A_2, B_2) \tag{42}
$$

where $g_i^{(N_i)}$ ($i = 1, 2$) is any decision statistic operating on the observations $\underline{X}_i$, if for $i = 1, 2$ the quantities $\hat\alpha_i$ and $\hat\beta_i$ of (36)-(37) satisfy the following condition:

$$
\ln\frac{1-\hat\beta_i}{\hat\alpha_i} \gg -\hat\beta_i \ln\frac{\hat\alpha_i \hat\beta_i}{(1-\hat\alpha_i)(1-\hat\beta_i)} \tag{43}
$$

Proof: To prove the right-hand-side inequality in (42) we only need to use Proposition 4 for
the optimality of the one-person strategies (the SPRTs). To prove the left-hand-side inequality
in (42) it suffices, because of the definition of $J(L_1, L_2, A_1, B_1, A_2, B_2)$, to show that
for $j = 0, 1$ and $i = 1, 2$

$$E_{j,i}\{N_i \mid L_i\} \le \hat{E}_{j,i}\{N_i \mid L_i\} \tag{44}$$

and

$$m_{0,i}(\{L_i^{(N_i)}(X_i) > B_i\}) \le \hat{m}_{0,i}(\{L_i^{(N_i)}(X_i) > B_i\}) \tag{45}$$

$$m_{1,i}(\{L_i^{(N_i)}(X_i) < A_i\}) \le \hat{m}_{1,i}(\{L_i^{(N_i)}(X_i) < A_i\}). \tag{46}$$

Since (45) and (46) follow from an application of Lemma 5 to the robust SPRT of the $i$-th
detector, we only need to prove (44) in order to complete the proof of (42). Next, we prove
(44) for $j = 1$; a similar proof holds for $j = 0$. We write


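$$E_{1,i}\{N_i \mid L_i\} = \frac{1}{E_{1,i}\{\ln L_i\}} \left[ \beta_i \ln\frac{\hat{\alpha}_i \hat{\beta}_i}{(1-\hat{\alpha}_i)(1-\hat{\beta}_i)} + \ln\frac{1-\hat{\beta}_i}{\hat{\alpha}_i} \right]$$

$$\approx \frac{1}{E_{1,i}\{\ln L_i\}} \ln\frac{1-\hat{\beta}_i}{\hat{\alpha}_i} \le \frac{1}{\hat{E}_{1,i}\{\ln L_i\}} \ln\frac{1-\hat{\beta}_i}{\hat{\alpha}_i}$$

$$\approx \frac{1}{\hat{E}_{1,i}\{\ln L_i\}} \left[ \hat{\beta}_i \ln\frac{\hat{\alpha}_i \hat{\beta}_i}{(1-\hat{\alpha}_i)(1-\hat{\beta}_i)} + \ln\frac{1-\hat{\beta}_i}{\hat{\alpha}_i} \right] = \hat{E}_{1,i}\{N_i \mid L_i\}. \tag{47}$$
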


In proving (47) we successively used the definition (32) of Lemma 6, condition (43), the
inequality $E_{1,i}\{\ln L_i\} \ge \hat{E}_{1,i}\{\ln L_i\}$ ($i = 1, 2$) which follows from the stochastic dominance
property (8) of Lemma 1 when applied to the increasing function $\ln(\cdot)$ and the probability
measures $m_{1,i}$ and $\hat{m}_{1,i}$, condition (43) again, and the definition (32) for the matched case
$m_{1,i} = \hat{m}_{1,i}$.
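
To give a feel for when condition (43) is met, the sketch below evaluates both sides of (43) for a single detector and also checks that the bracketed term in (47) is just a rearrangement of $\beta_i \ln A_i + (1-\beta_i) \ln B_i$ under Wald's approximations (40)-(41). The function names and the sample values are hypothetical; the text gives no numeric rule for how large the ratio of the two sides must be.

```python
import math

def condition_43_sides(alpha_hat, beta_hat):
    """Return (lhs, rhs) of condition (43) for one detector."""
    lhs = math.log((1.0 - beta_hat) / alpha_hat)
    rhs = -beta_hat * math.log(
        alpha_hat * beta_hat / ((1.0 - alpha_hat) * (1.0 - beta_hat))
    )
    return lhs, rhs

def h1_bracket(beta, alpha_hat, beta_hat):
    """Bracketed term in (47) for an actual miss probability beta."""
    return (beta * math.log(alpha_hat * beta_hat
                            / ((1.0 - alpha_hat) * (1.0 - beta_hat)))
            + math.log((1.0 - beta_hat) / alpha_hat))

def h1_stopped_mean(beta, alpha_hat, beta_hat):
    """beta*ln(A) + (1 - beta)*ln(B) with A, B from (40)-(41)."""
    B = (1.0 - beta_hat) / alpha_hat
    A = beta_hat / (1.0 - alpha_hat)
    return beta * math.log(A) + (1.0 - beta) * math.log(B)

# Small design errors: the left side dominates, so (43) is plausible.
lhs_small, rhs_small = condition_43_sides(1e-3, 1e-3)
# Large design errors: the two sides are comparable, so (43) fails.
lhs_large, rhs_large = condition_43_sides(0.3, 0.3)
print(rhs_small / lhs_small, rhs_large / lhs_large)
```

The ratio `rhs / lhs` is a convenient measure of how much is lost when the $\beta_i$ term is dropped in the second and fourth steps of (47).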


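Note: The optimal thresholds $(A_i, B_i)$ ($i = 1, 2$) can be determined from Wald's
approximations (40)-(41), where the error probabilities $\hat{\alpha}_i$ and $\hat{\beta}_i$ are solutions to the
minimization problem:

$$\min \Big\{ \lambda \Big[ k_1 \frac{\omega(\hat{\alpha}_1, \hat{\alpha}_1, \hat{\beta}_1)}{\hat{E}_{0,1}\{-\ln L_1\}} + k_2 \frac{\omega(\hat{\alpha}_2, \hat{\alpha}_2, \hat{\beta}_2)}{\hat{E}_{0,2}\{-\ln L_2\}} + e(\hat{\alpha}_1 + \hat{\alpha}_2) + (f - 2e)\hat{\alpha}_1 \hat{\alpha}_2 \Big]$$

$$+ (1-\lambda) \Big[ k_1 \hat{E}_{1,1}\{N_1 \mid L_1\} + k_2 \hat{E}_{1,2}\{N_2 \mid L_2\} + e(\hat{\beta}_1 + \hat{\beta}_2) + (f - 2e)\hat{\beta}_1 \hat{\beta}_2 \Big] \Big\},$$

with $\hat{E}_{1,i}\{N_i \mid L_i\}$ given by the last expression in (47), under the constraints
$0 \le \hat{\alpha}_i \le 1$, $0 \le \hat{\beta}_i \le 1$, and $\hat{\alpha}_i + \hat{\beta}_i \le 1$ for $i = 1, 2$.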
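
The minimization in the Note can be explored numerically. The following Python sketch performs a coarse grid search over $(\hat{\alpha}_1, \hat{\beta}_1, \hat{\alpha}_2, \hat{\beta}_2)$ for the matched (least-favorable) case, neglecting overshoot. It is only an illustration under stated assumptions: the function names, the grid, and every numeric value (including the constants $\lambda$, $k_i$, $e$, $f$ and the mean log-likelihood ratios `d0`, `d1`) are hypothetical, and a grid search is a stand-in for whatever optimizer one would actually use.

```python
import itertools
import math

def h0_term(a, b, d0):
    """omega(a, a, b) / E0{-ln L}: matched expected sample size under H0."""
    return (a * math.log(a / (1 - b)) + (1 - a) * math.log((1 - a) / b)) / d0

def h1_term(a, b, d1):
    """(b ln A + (1 - b) ln B) / E1{ln L}: matched expected size under H1."""
    return (b * math.log(b / (1 - a)) + (1 - b) * math.log((1 - b) / a)) / d1

def cost(p, lam, k, e, f, d0, d1):
    """Objective of the Note for p = (alpha1, beta1, alpha2, beta2)."""
    a1, b1, a2, b2 = p
    under_h0 = (k[0] * h0_term(a1, b1, d0[0]) + k[1] * h0_term(a2, b2, d0[1])
                + e * (a1 + a2) + (f - 2 * e) * a1 * a2)
    under_h1 = (k[0] * h1_term(a1, b1, d1[0]) + k[1] * h1_term(a2, b2, d1[1])
                + e * (b1 + b2) + (f - 2 * e) * b1 * b2)
    return lam * under_h0 + (1 - lam) * under_h1

def grid_minimize(lam, k, e, f, d0, d1, points=13):
    """Exhaustive search on a log-spaced grid from about 0.32 down to 3.2e-4."""
    grid = [10 ** (-0.5 - 0.25 * j) for j in range(points)]
    candidates = (p for p in itertools.product(grid, repeat=4)
                  if p[0] + p[1] < 1 and p[2] + p[3] < 1)
    return min(candidates, key=lambda p: cost(p, lam, k, e, f, d0, d1))

params = dict(lam=0.5, k=(0.01, 0.01), e=1.0, f=4.0, d0=(0.5, 0.5), d1=(0.5, 0.5))
best = grid_minimize(**params)
# Thresholds for each detector from Wald's approximations (40)-(41).
thresholds = [(b / (1 - a), (1 - b) / a) for a, b in (best[:2], best[2:])]
print(best, thresholds)
```

The product terms $(f - 2e)\hat{\alpha}_1\hat{\alpha}_2$ and $(f - 2e)\hat{\beta}_1\hat{\beta}_2$ are what prevent the search from splitting into two independent two-variable problems, which is the sense in which the optimal thresholds of the two detectors are coupled.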


III.C Asymptotic Performance

The  following  proposition  provides  a  result  on  the  asymptotic  speed— which  is 
defined  as  the  sum  of  the  asymptotic  (for  small  error  probabilities)  stopping  times  of  the 
two  detectors— of  the  robust  sequential  test. 


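Proposition 6: Suppose that for the problem (1) with the uncertainty classes (2) and
under the mismatch conditions of Proposition 5 above, the error probabilities $\hat{\alpha}_i$ and $\hat{\beta}_i$
(for $i = 1, 2$) approach zero. Then the sum of the asymptotic expected stopping times,
under mismatch and for the least-favorable case, satisfies the inequalities

$$\lambda \left[ k_1 \frac{-\ln A_1}{E_{0,1}\{-\ln L_1\}} + k_2 \frac{-\ln A_2}{E_{0,2}\{-\ln L_2\}} \right] + (1-\lambda) \left[ k_1 \frac{\ln B_1}{E_{1,1}\{\ln L_1\}} + k_2 \frac{\ln B_2}{E_{1,2}\{\ln L_2\}} \right]$$

$$\le \lambda \left[ k_1 \frac{-\ln A_1}{\hat{E}_{0,1}\{-\ln L_1\}} + k_2 \frac{-\ln A_2}{\hat{E}_{0,2}\{-\ln L_2\}} \right] + (1-\lambda) \left[ k_1 \frac{\ln B_1}{\hat{E}_{1,1}\{\ln L_1\}} + k_2 \frac{\ln B_2}{\hat{E}_{1,2}\{\ln L_2\}} \right]$$

$$\le \lambda \left[ k_1 \bar{E}_{0,1}\{N_1 \mid G_1\} + k_2 \bar{E}_{0,2}\{N_2 \mid G_2\} \right] + (1-\lambda) \left[ k_1 \bar{E}_{1,1}\{N_1 \mid G_1\} + k_2 \bar{E}_{1,2}\{N_2 \mid G_2\} \right], \tag{48}$$

where $G_1$ and $G_2$ are any other sequential tests different from the SPRT; $\bar{E}_{j,i}$ denotes
the limit of the expectation $\hat{E}_{j,i}$ as $\hat{\alpha}_i \to 0$ and $\hat{\beta}_i \to 0$.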

Proof: As $\hat{\alpha}_i \to 0$ and $\hat{\beta}_i \to 0$, $\alpha_i$ and $\beta_i$ approach zero as well, since $\alpha_i \le \hat{\alpha}_i$ and
$\beta_i \le \hat{\beta}_i$. Thus $J(L_1, L_2, A_1, B_1, A_2, B_2)$ (under mismatch) reduces to the first sum in
(48), whereas $\hat{J}(L_1, L_2, A_1, B_1, A_2, B_2)$ reduces to the second sum in (48). The first sum is
no larger than the second sum since $E_{0,i}\{-\ln L_i\} \ge \hat{E}_{0,i}\{-\ln L_i\}$ and
$E_{1,i}\{\ln L_i\} \ge \hat{E}_{1,i}\{\ln L_i\}$ for $i = 1, 2$ because of the stochastic dominance inequalities (7)
and (8) of Lemma 1. The second inequality in (48) holds because of a theorem by Wald
[12] (for the matched single detector case) which states that the SPRT has the minimum
asymptotic speed (expected stopping time) among all sequential tests.
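
The first inequality in (48) can be illustrated with a few lines of Python. The sketch below evaluates the weighted sum of asymptotic stopping times for given thresholds and given mean log-likelihood ratios, once with hypothetical "actual" means and once with smaller hypothetical least-favorable means. All names and numbers are assumptions made for the example; only the form of the sums is taken from (48).

```python
import math

def asymptotic_speed(thresholds, d0, d1, lam, k):
    """Weighted sum of asymptotic expected stopping times, cf. (48).

    thresholds: list of (A_i, B_i) pairs, one per detector
    d0[i]: mean of -ln L_i under H0; d1[i]: mean of ln L_i under H1
    """
    under_h0 = sum(ki * (-math.log(A)) / d0i
                   for ki, (A, _), d0i in zip(k, thresholds, d0))
    under_h1 = sum(ki * math.log(B) / d1i
                   for ki, (_, B), d1i in zip(k, thresholds, d1))
    return lam * under_h0 + (1 - lam) * under_h1

thresholds = [(1e-3, 1e3), (1e-2, 1e2)]   # hypothetical (A_i, B_i)
weights = dict(lam=0.5, k=(1.0, 1.0))     # hypothetical lambda, k_1, k_2

# Stochastic dominance makes the actual means at least as large as the
# least-favorable ones, so the actual sum cannot exceed the worst case.
actual = asymptotic_speed(thresholds, d0=(0.8, 0.6), d1=(0.9, 0.7), **weights)
least_favorable = asymptotic_speed(thresholds, d0=(0.5, 0.4), d1=(0.5, 0.4), **weights)
print(actual, least_favorable)
```

Since each term has the form (log threshold) / (mean log-likelihood ratio), enlarging any denominator can only shorten the corresponding stopping time, which is exactly the argument used in the proof.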
V. CONCLUSIONS

In this paper we considered two detectors making independent observations and
trying to decide which one of two hypotheses is true. Both fixed-sample-size (block) detection
and sequential detection were considered. The decisions were coupled through a
common  cost  function  which  for  fixed-sample-size  tests  consisted  of  the  sum  of  the  error 
probabilities  while  for  sequential  tests  it  comprised  the  sum  of  the  error  probabilities  and 
the  expected  sample  sizes.  The  probability  measures  which  govern  the  statistics  of  the 
i.i.d.  observations  belonged  to  uncertainty  classes  determined  by  2-alternating  capacities. 

We  were  able  to  derive  minimax  robust  (worst-case)  designs  according  to  which  the 
two  detectors  employ  fixed-sample-size  tests  or  sequential  probability  ratio  tests  whose 
likelihood  ratios  and  thresholds  depend  on  the  least-favorable  probability  measures  over 
the uncertainty class (actually, the Huber-Strassen derivative and the least-favorable elements
of the 2-alternating capacity class). For the aforementioned cost function the
optimal  thresholds  of  the  two  detectors  turn  out  to  be  coupled.  It  was  shown  that, 
despite  the  uncertainty,  the  two  detectors  are  guaranteed  a  minimum  level  of  acceptable 
performance.  In  the  case  of  block  detection  it  was  also  shown,  via  Chernoff  bounds,  that 
for  the  aforementioned  robust  likelihood  ratio  test  the  two-detector  cost  function 
decreases  exponentially  to  zero  as  the  number  of  observations  increases  for  all  elements 
in  the  uncertainty  class. 

The results of this paper can be extended in several directions. First, they can be
extended to situations of distributed detection where the two detectors are still making
independent observations but the observations for each detector are not i.i.d. (they could
be stationary Gaussian with spectral uncertainty, first-order Markov with uncertainty in
the transition probabilities, or more generally dependent observations). Second, we
can formulate and solve similar problems in continuous time (see [4]). Third, we can formulate
and investigate problems of data fusion from distributed sensors in uncertain
environments. Finally, we should relax the assumption of independent observations for
the two detectors and formulate and attempt to solve similar problems for the case in
which the observations of the two detectors are correlated.